Wie ist die Medienresonanz von Pressemittelungen politischer Parteien?
Assumptions:
Parteien wollen “ihre” Themen in den Medien platzieren, d.h. die Themen, die den - meist programmatisch bestimmten - Kern ihrer Wahlaussage bilden.
Parteien wollen Probleme in der Vordergrund rücken, für die sie nach Ansicht der Bevölkerung insgesamt oder nach Ansicht des eigenen Anhangs die Lösungskompentenz besitzen.
Parteien wollen Themen vermeiden, die aufgrund der aktuellen Sachlage gegen sie sprechen. Stattdessen wollen sie andere Themen (Sachthemen, Personal- und Stilfragen) in den Vordergrund rücken. Instrument hierfür sind Pressemitteilungen der Parteien und Fraktionen.
Parteien möchten, dass ihre Sichtweisen möglichst ungekürzt und unverfälscht publiziert werden.
Parties and candidates not only want to be present in the media (coverage bias), or evaluated in a positive way (tonality bias). They also want the media agenda to be congruent with their own agenda to define the issue-based criteria on which they will be evaluated by voters. Thus, parties choose their issue agenda carefully, highlighting issues that they are perceived to be competent on, that they “own” and that are important to their voters. In that sense agenda bias refers to the extent to which political actors appear in the public domain in conjunction with the topics they wish to emphasize.
To allow for an operationalization of agenda bias, I use parties’ campaign communication as an approximation of the potential universe of news stories (D’Alessio & Allen, 2000; Eberl, 2017). I compare the policy issues addressed in campaign communication (i.e., the party agenda) with the policy issues the parties address in media coverage (i.e., the mediated party agenda).
To discover the latent topics in the corpus of press releases (1.942) and news articles (11.880), a structural topic modeling (STM) developed by Roberts (2016) is applied. The STM is an unsupervised machine learning approach that models topics as multinomial distributions of words and documents as multinomial distributions of topics, allowing to incorporate external variables that effect both, topical content and topical prevalence.
STM assumes a fixed user-specified number of topics. There is not a “right” answer to the number of topics that are appropriate for a given corpus (Grimmer and Stewart 2013), but the function searchK uses a data-driven approach to selecting the number of topics. The function will perform several automated tests to help choose the number of topics including calculating the held out likelihood (Wallach et al. 2009) and performing a residual analysis (Taddy 2012).
I included the document source as a control for the topical topical prevalence, assuming that the distribution of topics depends on the sources. The number of topics is set to 80.
library(stm)
library(tidyverse)
library(ggthemes)
rm(list = ls())
load("../output/models/finalmodel_60_nocontet.RDa")
model_df <- model_df %>%
dplyr::mutate(doc_index = as.numeric(rownames(.)),
source = ifelse(source == "welt.de", "DIE WELT", source),
source = ifelse(source == "zeit.de", "ZEIT ONLINE", source),
source = ifelse(source == "focus.de", "FOCUS Online", source),
source = ifelse(source == "bild.de", "Bild.de", source),
source = ifelse(source == "spiegel.de", "SPIEGEL ONLINE", source),
source = ifelse(source == "union", "Union", source),
source = ifelse(source == "spd", "SPD", source),
source = ifelse(source == "afd", "AfD", source),
source = ifelse(source == "gruene", "Grüne", source),
source = ifelse(source == "linke", "Linke", source),
source = ifelse(source == "fdp", "FDP", source)
)
model_df %>%
ggplot(aes(source, fill=type)) +
geom_bar(show.legend = F, alpha = 0.8) +
coord_flip() +
facet_wrap(~type, scales = "free") +
theme_hc() +
scale_fill_hc() +
labs(title = "Document distribution", y=NULL, x = NULL)
To explore the words associated with each topic we use the words with the highest probability in each topic. As we included the source type (press release or news paper) as a control for the topical content (the word distribution of each topic), we have two different labels for each topic.
sagelabs <- sageLabels(stmOut)
## Without Content Covariate ##
topics.df <- as.data.frame(sagelabs$cov.betas[[1]]$problabels) %>%
transmute(topic = as.numeric(rownames(.)),
joint_label = paste( "Topic",topic, ":", V1,V2,V3,V4))
topics.df %>% select(joint_label) %>%
htmlTable::htmlTable(align="l", header = c("Topic Label"),
rnames = F)
| Topic Label |
|---|
| Topic 1 : fdp koalition grünen spd |
| Topic 2 : spd schulz gabriel nahles |
| Topic 3 : eu erklärt fraktion deutschland |
| Topic 4 : the of to is |
| Topic 5 : diesel deutschland deutschen autos |
| Topic 6 : trump gipfel hamburg polizei |
| Topic 7 : merkel deutschland regierung spd |
| Topic 8 : grünen jamaika fdp csu |
| Topic 9 : afd petry partei fraktion |
| Topic 10 : cdu niedersachsen spd grünen |
| Topic 11 : prozent spd umfrage afd |
| Topic 12 : euro milliarden spd union |
| Topic 13 : daten fragen antworten bundesregierung |
| Topic 14 : bundesregierung menschen deutschland erklärt |
| Topic 15 : merkel kanzlerin angela cdu |
| Topic 16 : afd facebook twitter medien |
| Topic 17 : afd bundestagswahl prozent wahl |
| Topic 18 : csu cdu union seehofer |
| Topic 19 : cdu merkel spahn altmaier |
| Topic 20 : fdp lindner jamaika grünen |
| Topic 21 : spd frauen heil oppermann |
| Topic 22 : schäuble euro finanzminister deutschen |
| Topic 23 : kinder arbeit prozent frauen |
| Topic 24 : kohl helmut kohls kanzler |
| Topic 25 : eu europa deutschland europäischen |
| Topic 26 : welt menschen politik deutschland |
| Topic 27 : grünen özdemir göring eckardt |
| Topic 28 : afd stiftung partei glaser |
| Topic 29 : schwesig spd ministerpräsidentin manuela |
| Topic 30 : spd union koalitionsverhandlungen groko |
| Topic 31 : pflege menschen opfer patienten |
| Topic 32 : schulz spd merkel martin |
| Topic 33 : ge ten be ver |
| Topic 34 : jahr berlin bundesregierung deutschland |
| Topic 35 : palmer talk grünen bosbach |
| Topic 36 : spd koalition union schulz |
| Topic 37 : amri cdu berliner polizei |
| Topic 38 : berliner polizei berlin muslime |
| Topic 39 : afd petry staatsanwaltschaft frauke |
| Topic 40 : bundeswehr soldaten leyen nato |
| Topic 41 : bundestag afd abgeordneten fraktion |
| Topic 42 : russland gabriel deutschland us |
| Topic 43 : hamburg gipfel polizei hamburger |
| Topic 44 : us menschen trump deutschland |
| Topic 45 : cdu schleswig günther laschet |
| Topic 46 : afd gauland özoguz alexander |
| Topic 47 : bundestag gesetz spd abstimmung |
| Topic 48 : wahlkampf guttenberg politik christian |
| Topic 49 : antisemitismus juden berlin deutschland |
| Topic 50 : stadt polizei bürgermeister flüchtlinge |
| Topic 51 : türkei erdogan türkischen deutschland |
| Topic 52 : afd partei höcke pazderski |
| Topic 53 : flüchtlinge deutschland familiennachzug menschen |
| Topic 54 : bildung deutschland bund schulen |
| Topic 55 : weidel afd alice spitzenkandidatin |
| Topic 56 : august spd cdu prozent |
| Topic 57 : linke linken wagenknecht partei |
| Topic 58 : maizière innenminister verfassungsschutz deutschland |
| Topic 59 : csu seehofer söder horst |
| Topic 60 : schmidt glyphosat hendricks spd |
theta <- as.data.frame(stmOut$theta) %>% # get all theta values for each document
mutate(doc_index = as.numeric(rownames(.))) %>%
# convert to long format
gather(topic, theta, -doc_index) %>%
mutate(topic = as.numeric(gsub("V","",topic))) %>%
# join with topic df
left_join(., topics.df, by="topic") %>%
# join with model_df
left_join(., model_df %>%
select(date,type,source,doc_index,title_text), by="doc_index")
For each document, we have a distribution over all topics, e.g.:
sample_doc <- sample(nrow(model_df),1)
# uncomment this to only select docs from press releases
#sample_doc <- theta %>% filter(type=="press") %>% sample_n(1) %>% select(doc_index)
#sample_doc <- sample_doc$doc_index
title <- model_df$title[which(model_df$doc_index == sample_doc)]
source <- model_df$source[which(model_df$doc_index == sample_doc)]
theta %>%
filter(doc_index == sample_doc) %>%
select(doc_index, joint_label, theta) %>%
ggplot(aes(joint_label, theta)) +
geom_col(fill="#0099c6", alpha = 0.8) +
ylim(c(0,1)) +
coord_flip() +
theme_hc() +
labs(title = paste("Topic distribution of document",sample_doc),
subtitle = paste0("Source: ",source,"\nTitle: ", title),
x = NULL, y = NULL
) +
theme(axis.text = element_text(size = 10))
What is the document acutally about?
model_df %>%
filter(doc_index == sample_doc) %>%
select(source, title_text) %>%
htmlTable::htmlTable(align="l", rnames=FALSE, header = c("Source", "Title + Body"))
| Source | Title + Body |
|---|---|
| FOCUS Online | Wahlplakate: AfD geht mit Belohnungen gegen Zerstörung vor - FOCUS Online Montag, 28.08.2017, 14:08 Speziell in Stadtvierteln nahe der Münchner Innenstadt werden nach Angaben der AfD bis zu 80 Prozent der Wahlplakate zerstört. In manchen Straßen sei ein Komplettverlust sämtlicher Plakate durch Diebstahl oder Zerstörung zu beklagen. Insgesamt stehe für Hinweise auf die Ergreifung von Tätern ein Budget von 20 000 Euro zur Verfügung, teilten die AfD-Kreisverbände am Montag mit. “Der Kreisverband arbeitet dabei mit verschiedenen Sicherheitsunternehmen zusammen und wird ihren Mitarbeitern 400 Euro für jeden überstellten Straftäter auszahlen“, sagt AfD -Bundestagskandidat Florian Jäger. Der Kreisverband Dachau-Fürstenfeldbruck gibt an, dass 70 bis 80 Prozent seiner Wahlplakate zerstört worden seien. In manchen Ortsteilen seien sogar alle Plakate gestohlen oder zerstört worden. Dabei entstehe ein finanzieller Schaden. Es müssten nicht nur neue Plakate angeschafft werden, auch die Fahrt- und Personalkosten für die neue Plakatierung seien nicht unerheblich. Die AfDkündigte an, gegen die Plakatzerstörer juristisch vorgehen zu wollen. Neben einer Anzeige wegen Diebstahls oder Sachbeschädigung solle von jedem Täter Schadenersatz eingefordert werden. Im Video: Wahlplakate im Umfrage-Test FOCUS Online/Wochit Einen Monat vor der Wahl punktet bei den Unentschlossenen besonders eine Partei |
The expected proportion of the corpus that belongs to each topic is used to get an initial overview of the results. The figure below displays the topics ordered by their expected frequency across the corpus. The four most frequent words in each topic are used as a label for that topic.
overall_freq <- as.data.frame(colMeans(stmOut$theta)) %>%
transmute(
topic = as.numeric(rownames(.)),
frequency = colMeans(stmOut$theta)
) %>%
left_join(., topics.df, by = "topic") %>%
arrange(desc(frequency))%>%
mutate(order = row_number())
overall_freq %>%
ggplot(aes(reorder(joint_label, -order), frequency)) +
geom_col(alpha = 0.8) +
coord_flip() +
theme_hc() +
labs(x=NULL, y=NULL)
ggsave("../figs/topic_proportion.png", height = 6, width = 4)
Agendas were measured in terms of percentage distributions across the 80 topics. For each source the average distribution of each topic is calculated for each month. The following pictures show the overall topic distribution.
# calculate topic mean by source and month
topicmean <- theta %>%
mutate(
year = lubridate::year(date),
month = lubridate::month(date)
) %>%
group_by(topic,source, month, year) %>%
dplyr::summarise(topicmean = mean(theta)) %>%
ungroup() %>%
spread(source, topicmean) %>%
filter(month != 3)
topicmean_news <- theta %>%
filter(type == "news") %>%
group_by(topic,joint_label, source) %>%
summarise(topicmean = mean(theta)) %>%
ungroup()
topicmean_press <- theta %>%
filter(type == "press") %>%
group_by(topic,joint_label, source) %>%
summarise(topicmean = mean(theta)) %>%
ungroup()
topicmean_news %>%
ggplot(aes(reorder(joint_label,desc(topic)),topicmean)) +
geom_col(fill="#0099c6", alpha = 0.8) +
coord_flip() +
theme_hc() +
facet_grid(~source) +
labs(x=NULL, y=NULL, title="Average distribution of topics",
subtitle = "Online news") +
theme(axis.text.x = element_text(size = 6))
ggsave("../figs/topic_proportion_news.png", width = 11, height =10)
topicmean_press %>%
filter(topic != 14) %>%
ggplot(aes(reorder(joint_label,desc(topic)),topicmean)) +
geom_col(fill="#0099c6",alpha=0.8) +
coord_flip() +
theme_hc() +
facet_grid(~source) +
labs(x=NULL, y=NULL, title="Average distribution of topics",
subtitle = "Press releases"
) +
theme(axis.text.x = element_text(size = 6))
ggsave("../figs/topic_proportion_press.png", width = 11, height =10)
Then, we estimated bivariate correlations between party agendas and the mediated party agendas in the online news. These correlations represent the agenda selectivity each party experiences in each media outlet. The higher the correlation, the more congruent both agendas are.
media <- unique(model_df %>% filter(type == "news") %>% select(source))
parties <- unique(model_df %>% filter(type == "press") %>% select(source))
rm(corrDF)
for (i in parties$source) {
tempdf <- topicmean %>%
group_by(month, year) %>%
do(data.frame(Cor=t(cor(.[,media$source], .[,i])))) %>%
gather(medium, cor, 3:9) %>%
mutate(party = i,
medium = gsub("Cor.","",medium)) %>%
ungroup()
if (exists("corrDF")){
corrDF <- rbind(corrDF,tempdf)
} else {
corrDF <- tempdf
}
}
agenda <- corrDF %>%
mutate(date = as.Date(paste0(year,"/",month,"/1"))) %>%
dplyr::mutate(medium = ifelse(medium == "DIE.WELT", "DIE WELT", medium),
medium = ifelse(medium == "ZEIT.ONLINE", "ZEIT ONLINE", medium),
medium = ifelse(medium == "FOCUS.Online", "FOCUS Online", medium),
medium = ifelse(medium == "SPIEGEL.ONLINE", "SPIEGEL ONLINE", medium)
)
normalize_data <- function(x) {
# normalize data between -1,1
if (is.numeric(x)) {
y <- 2*((x - min(x, na.rm = T)) / (max(x, na.rm = T) - min(x, na.rm = T)))-1
return(y)
} else {
return(x)
}
}
p <- agenda %>%
mutate(
date =as.Date(paste("01",month,year, sep = "-"), format="%d-%m-%Y")
) %>%
ggplot(aes(date, cor, color = medium, group = medium)) +
geom_line(show.legend = F) +
geom_hline(yintercept = 0, size = 0.3, color = "grey30", linetype = 2) +
facet_wrap(~party) +
labs(y=NULL, x =NULL)
# guides(colour = guide_legend(nrow = 1)) +
# theme(legend.position = "bottom",
# legend.title = element_blank())
plotly::ggplotly(p)
agenda %>%
group_by(party, medium) %>%
summarize(cor = mean(cor, na.rm = T)) %>%
spread(key = party, value = cor) %>%
ggiraphExtra::ggRadar(aes(color = medium),
interactive = T,
alpha = 0,
rescale = F,
legend.position = "bottom")